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Abstract 

We study the average distortion introduced by scalar, vector, and entropy coded quantization of compressive sensing (CS) 
measurements. The asymptotic behavior of the underlying quantization schemes is either quantified exactly or characterized via 
bounds. We adapt two benchmark CS reconstruction algorithms to accommodate quantization errors, and empirically demonstrate 
that these methods significantly reduce the reconstruction distortion when compared to standard CS techniques. 

& ■ 

I. Introduction 

Compressive sensing (CS) is a linear sampling method that converts unknown input signals, embedded in a high dimensional 
r***" . space, into signals that lie in a space of significantly smaller dimension. In general, it is not possible to uniquely recover an 
unknown signal using measurements of reduced-dimensionality. Nevertheless, if the input signal is sufficiently sparse, exact 
reconstruction is possible. In this context, assume that the unknown signal x € M. N is if -sparse, i.e., that there are at most K 
' nonzero entries in x. A naive reconstruction method is to search among all possible signals and find the sparsest one which 
i ^ i ' is consistent with the linear measurements. This method requires only m — 2K random linear measurements, but finding the 
1 sparsest signal representation is an NP-hard problem. On the other hand, Donoho and Candes et. al. demonstrated in [l]-[4] 

■ that sparse signal reconstruction is a polynomial time problem if more measurements are taken. This is achieved by casting the 
, reconstruction problem as a linear programming problem and solving it using the basis pursuit (BP) method. More recently, 

[~^. , the authors proposed the subspace pursuit (SP) algorithm in [5] (see also the independent work [6] for a closely related 
| approach). The computational complexity of the SP algorithm is linear in the signal dimension, and the required number of 

' linear measurements is of the same order as that for the BP method. 

O ' 

For most practical applications, it is reasonable to assume that the measurements are quantized and therefore do not have 
' infinite precision. When the quantization error is bounded and known in advance, upper bounds on the reconstruction distortion 
were derived for the BP method in [7] and the SP algorithm in [5], [6], respectively. For bounded compressible signals, which 

■ have transform coefficients with magnitudes that decay according to a power law, an upper bound on the reconstruction 
distortion introduced by a uniform quantizer was derived in [8]. The same quantizer was studied in [9] for exactly if-sparse 
signals and it was shown that a large fraction of quantization regions is not used [9]. All of the above approaches focus on 
the worst case analysis, or simple one -bit quantization [10]. An exception includes the overview paper [11], which focuses on 
the average performance of uniform quantizers, assuming that the support set of the sparse signal is available at the quantizer. 

As opposed to the worst case analysis, we consider the average distortion introduced by quantization. We study the asymptotic 
distortion rate functions for scalar quantization, entropy coded scalar quantization, and vector quantization of the measurement 
vectors. Exact asymptotic distortion rate functions are derived for scalar quantization when both the measurement matrix and 
the sparse signals obey a certain probabilistic model. Lower and upper bounds on the asymptotic distortion rate functions are 
also derived for other quantization scenarios, and the problem of compressive sensing matrix quantization is briefly discussed 
as well. In addition, two benchmark CS reconstruction algorithms are adapted to accommodate quantization errors. Simulations 
show that the new algorithms offer significant performance improvement over classical CS reconstruction techniques that do 
not take quantization errors into consideration. 

This paper is organized as follows. Section II contains a brief overview of CS theory, the BP and SP reconstruction 
algorithms, and various quantization techniques. In Section III, we analyze the CS distortion rate function and examine the 

*Part of the material in this paper was submitted to the IEEE Information Theory Workshop (ITW), 2009, and the IEEE International Symposium on 
Information Theory (ISIT), 2009. 
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influence of quantization errors on the BP and SP reconstruction algorithms. In Section IV, we describe two modifications of 
the aforementioned algorithms, suitable for quantized data, that offer significant performance improvements when compared 
to standard BP and SP techniques. Simulation results are presented in Section V. 

II. Preliminaries 

A. Compressive Sensing (CS) 

In CS, one encodes a signal x of dimension N by computing a measurement vector y of dimension of m <C N via linear 
projections, i.e., 

y = *x, 

where $ g M. mxN is referred to as the measurement matrix. In this paper, we assume that x e is exactly if -sparse, i.e., 
that there are exactly K entries of x that are nonzero. The reconstruction problem is to recover x given y and 
The BP method is a technique that casts the reconstruction problem as a li -regularized optimization problem, i.e., 

min || 3c|| x subject to y = *x, (1) 

where Hx^ = X^2=i l x «l denotes the Zi-norm of the vector x. It is a convex optimization problem and can be solved efficiently 
by linear programming techniques. The reconstruction complexity equals O (m 2 7V 3 / 2 ) if the convex optimization problem is 
solved using interior point methods [12]. 

The computational complexity of CS reconstruction can be further reduced by the SP algorithm, recently proposed by two 
research groups [5], [6]. It is an iterative algorithm drawing on the theory of list decoding. The computational complexity of 
this algorithm is upper bounded by O (Km{N + if 2 )), which is significantly smaller than the complexity of the BP method 
whenever if <C N. See [5] for a detailed performance and complexity analysis of this greedy algorithm. 

A sufficient condition for both the BP and SP algorithms to perform exact reconstruction is based on the so called restricted 
isometry property (RIP) [2], formally defined as follows. 

Definition 1: (RIP). A matrix <fr € R mxAr is said to satisfy the Restricted Isometry Property (RIP) with coefficients (if, 5) 
for if < m, < 6 < 1, if for all index sets I C {1, • • • , N} such that |i| < if and for all q e R m , one has 

(l-5)||q||^<||*jq||^<(l + J)||q||^. 
The RIP parameter 5k is defined as the infimum of all parameters 5 for which the RIP holds, i.e., 

5 K := inf {S : (1 - 5) \\qf 2 < ||*/q||^ < (1 + 5) \\qf 2 , 

V|/| < K, Vqe Ml J l} . (2) 

It was shown in [5], [7] that both BP and SP algorithms lead to exact reconstructions of if-sparse signals if the matrix $ 
satisfies the RIP with a constant parameter, i.e., 5 Ci k < c where both c\ € M + and c € (0, 1) are constants independent 
of K (although different algorithms may have different parameters cos and c\s). Most known families of matrices satisfying 
the RIP property with optimal or near-optimal performance guarantees are random, including Gaussian random matrices with 
i.i.d. Af(0, 1/m) entries, where m > O (if log /V). 

For completeness, we briefly describe the SP algorithm. For an index set T C {1,2, •• • ,7Y}, let <&t be the "truncated 
matrix" consisting of the columns of <I> indexed by T, and let span ($t) denote the subspace in R m spanned by the columns 
of <I?t- Suppose that <&^<J?t is invertible. For any given y € R m , the projection of y onto span($T) is defined as 

Y P - proj (y, *r) := *t (*t*t) _1 *tY> (3) 

where 3>* denotes the conjugate transpose of <!?. 

The corresponding projection residue vector y r and projection coefficient vector x p are defined as 

y r = resid(y, * T ) :=y-y P , (4) 
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and 

x p = pcocff (y, * T ) := (Q^^y 1 $* T y. (5) 
The steps of the SP algorithm are summarized below. 



Algorithm 1 The Subspace Pursuit (SP) Algorithm 
Input: K, <I>, y 

Initialization: Let T° = {K indices corresponding to entries of largest magnitude in <f?*y} and y° = rcsid (y, &f ). 
Iteration: At the I th iteration, go through the following steps. 

1) T e — T 1 ^ 1 \J{K indices corresponding to entries of largest magnitude in <l>*y^ _1 }. 

2) Let x p = pcocff (y, $>f e ) and T l = {K indices corresponding to entries of largest magnitude in x p }. 

3) y e r = resid(y, * T <). 

4) If \\y e r || 2 > Hy^" 1 !^, let T e = T^ 1 and quit the iteration. 

Output: The vector x satisfying x^ ... t ^}-T e = an d *-T l = pcocff (y, & T e). 



In what follows, we study the performance of the SP and BP reconstruction algorithms when the measurements are subjected 
to three different quantization schemes. We also discuss the issue of quantizing the measurement matrix values. 

B. Scalar and Vector Quantization 

Let C C R m be a finite discrete set, referred to as a codebook. A quantizer is a mapping from R m to the codebook C with 
the property that 



ynueC if ye TZ U 



(6) 



where u> is referred to as a level and is the quantization region corresponding to the level u). The performance of a 
quantizer is often described by its distortion-rate function, defined as follows. Let the distortion measure be the squared 
Euclidean distance (i.e., mean squared error (MSE)). For a random source Y e W n , the distortion associated with a quantizer 
q is D q := E || Y — q (Y) || 2 ■ For a given codebook C, the optimal quantization function that minimizes the Euclidean 
distortion measure is given by 



q* (Y) = arg min ||Y — w|| 2 . 
u>ec 



As a result, the corresponding quantization region is given by 



:={yeR m : ||y - uf 2 < \\y - u'\\l , Vw' e c} 



(7) 



and the distortion associated with this codebook C equals 

D{C) := E 



I Y - q* (Y) 



Let R := log 2 \C\ be the rate of the codebook C. For a given code rate R, the distortion rate function is given by 



D* (R) := inf D (C) . (8) 

C:±log 2 \C\<R 

For simplicity, assume that the random source Y does not have mass points, and that the levels in the quantization codebook 
are all distinct. With these assumptions, though different quantization regions (7) may overlap, the ties can be broken arbitrarily 
as they happen with probability zero. 

We study both vector quantization and scalar quantization. Scalar quantization has lower computational complexity than 
vector quantization. It is a special case of vector quantization when m = 1. To distinguish the two schemes, we use the 
subscripts SQ and VQ to refer to scalar and vector quantization, respectively. For quantized compressive sensing, we assume 
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that the quantization functions for all the coordinate of Y are the same. The corresponding distortion rate function is therefore 
of the form 

m 

|2 



D* so (R) := inf E Y 



i=l 



(9) 



Necessary conditions for optimal scalar quantizer design can be found in [13]. The quantization region for the level e C, 
,2 R , can be written in the form TZ LJi = where € R 1J {— oo} (J {+00} and is the 



i = 1,2, 



closure of the open interval An optimal quantizer satisfies the following conditions: 

1) If the optimal quantizer has levels and Wj, then the threshold that minimizes the mean square error (MSE) is 



1 , 



2) If the optimal quantizer has thresholds and tj, then the level that minimizes the MSE is 

uji = E 



Y\Y e {U-uU) 



(10) 



(11) 



Lloyd's algorithm [13] for quantizer codebook design is based on the above necessary conditions. Lloyd's algorithm starts 
with an initial codebook, and then in each iteration, computes the thresholds t^s according to (10) and updates the codebook 
via (11). Although Lloyd's algorithm is not guaranteed to find a global optimum for the quantization regions, it produces 
locally optimal codebooks. 

As a low-complexity alternative to non-uniform quantizers, uniform scalar quantizers are widely used in practice. A uniform 
scalar quantizer is associated with a "uniform codebook" C u ,sq = {^1 < ^2 < • • • < &m} , for which uji — uii-i = Uj — 
for all 1 < i 7^ j < M. The difference between adjacent levels is often referred to as the step size, and denoted by A Ui sq. 
The corresponding distortion rate function is given by 



D 



u,SQ 



(R) :- 



inf 

Cu.SQ: log 2 \C u , SQ \<R 



Ey 



(12) 



where Csq in (9) is replaced by C u ^sq- 

Definitions (9) and (12) are consistent with (8) as a Cartesian product of scalar quantizers can be viewed as a special form 
of a vector quantizer. 

III. Distortion Analysis 

We analyze the asymptotic behavior of the distortion rate functions introduced in the previous section. We assume that the 
quantization codebook C, for both scalar and vector quantization, is designed offline and fixed when the measurements are 
taken. 



A. Distortion of Scalar Quantization 

For scalar quantization, we consider the following two CS scenarios. 

Assumptions I: 

1) Let $ = -4= A e M mxAr , where the entries of A are i.i.d. Subgaussian random variables 1 with zero mean and unit 
variance. 

'A random variable X is said to be Subgaussian if there exist positive constants ci and C2 such that 

Pr (|X| > x) < c ie -^ x2 Mx > 0. 

One property of Subgaussian distributions is that they have a well defined moment generating function. Note that the Gaussian and Bernoulli distributions 
are special cases of the Subgaussian distribution. 
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2) Let X e M. N be an exactly if-sparse vector, that is, a signal that has exactly K nonzero entries. We assume that the 
nonzero entries of X are i.i.d. Subgaussian random variables with zero mean and unit variance, although more general 
models can be analyzed in a similar manner. 

Assumptions II: Assume that X e R n is exactly X-sparse, and that the nonzero entries of X are i.i.d. standard Gaussian 
random variables. 

The asymptotic distortion-rate function of the measurement vector under the first CS scenario is characterized in Theorem 

1. 

Theorem 1: Suppose that Assumptions I hold. Then 

iJ^oo(if,m"iv)->cx) K 

and 



Urn _ lim — D* SQ (R) = (13) 



2 2R 4 

R lim ,u lim , kH^Q W = * ln2 ' (14) 

R^ca(K,m,N)~^oo Ait O 

The proof is based on the fact that the distributions of ■J^Yh 1 < i < m, weakly converge to standard Gaussian distributions. 
The detailed description is given in Appendix A. 

To study the scenario described by Assumptions II, we need the following definitions. For a given matrix <&, let 

■= jf Yl Vlv < 15 > 

ie[m],je[N] 

and 



(12-.= ^.^E^iJ. ( 16 > 

where [m] — {1,2, •• • , m} and ( j^) denotes the set of all subsets of [N] with cardinality K. Note that if the matrix <I> 
is generated from the random ensemble described in Assumption 1.1), then fii e (1 — e, 1 + e) with high probability, for all 
e > 0, and whenever m and N are sufficiently large. It is straightforward to verify that \i\ < ii 2 - 

With these definitions at hand, bounds on the distortion rate function can be described as below. 

Theorem 2: Suppose that Assumption II holds. Then 

ttV3 2 2R 



^ iim sur 

and 



< l im sup — DJ g (i?) < ^ M 2, (17) 



4 In 2 2 2R 

—^^TJ^kr^W- (18) 

The detailed proof is postponed to Appendix B. Here, we sketch the basic ideas behind the proof. In order to construct 
a lower bound, suppose that one has prior information about the support set T before taking the measurements. For a given 
value of i and for a given T, we calculate the corresponding asymptotic distortion-rate function. The lower bound is obtained 
by taking the average of these distortion-rate functions over all possible values of i and T. For the upper bound, we design a 
sequence of sub-optimal scalar quantizers, then apply them to all measurement components, and finally construct a uniform 
upper bound on their asymptotic distortion-rate functions, valid for all i and T. The uniform upper bound is given in (17). 

Remark 1: Our results are based on the fundamental assumption that the sparsity level K is known in advance and that the 
statistics of the sparse vector x is specified. Very frequently, however, this is not the case in practice. If we relax Assumptions 
I and II further by assuming that K is sufficiently large, it will often be the case that the statistics of the measurement Y, is 
well approximated by a Gaussian distribution. Here, note that different variables may have different variances and these 
variances are generally unknown in advance. The problem of statistical mismatch has been analyzed in the proof of the upper 
bound (17) (see Proposition 1 of Appendix B for details). In particular, non-uniform quantization with slightly over-estimated 
variance performs better than that with under-estimated variance [14, Chapter 8.6]. 
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According to Theorem 1, if the quantization rate R is sufficiently large, the distortion of the optimal non-uniform quantizer is 
approximately only 1/R of that of the optimal uniform quantizer. This gap can be closed by using entropy coding techniques 
in conjunction with uniform quantizers. 

B. Uniform Scalar Quantization with Entropy Encoding 

Let B eri c = {^1,^2, • • • , %} be a binary codebook, where the codewords vu 1 < i < M, are finite-length strings over the 
binary field with elements {0, 1}. The codebook B enc can, in general, contain codewords of variable length - i.e., the lengths 
of different codewords are allowed to be different. Let ii be the length of codeword V{, i = 1, 2, • • • , M. Then Vi e {0, l} £lXl . 
For a given quantization codebook C = {o>i,cj2, • ■ ■ ,%}, the encoding function f enc is a mapping from the quantization 
codebook C to the binary codebook B enc , i.e., f enc (uj) = v e B enc . The extension f* nc is a mapping from finite length strings 
of C to finite length strings of B enc (a concatenation of the corresponding binary codewords): 

fenc ( UJ ii UJ i2 ' ' ' w i s ) = fenc (WjJ fenc (^>i 2 ) ' ' ' fenc (Wj, ) . 

The code Z? enc is called uniquely decodable if any concatenation of binary codewords v il v i2 ■ ■ ■ v is has only one possible 
preimage string ujj 1 ^j 2 ■ ■ ■ u>j e producing it. In practice, the code B enc is often chosen to be a prefix code, that is, no codeword 
is a prefix of any other codeword. A prefix code can be uniquely decoded as the end of a codeword is immediately recognizable 
without checking future encoded bits. 

We consider the case in which scalar quantization is followed by variable-length encoding. The corresponding expected 
encoding length L is defined by 

L = E y [£ o f enc o q SQ (Y)} , 

where £ (v) outputs the length of the encoding codeword v e B enc . The goal is to jointly design q S Q and f enc to minimize 
the expected encoding length L. We are interested in the distortion rate function defined by 



D* enc (R) := inf E Y 

L<R 

Theorem 3: Suppose that Assumptions I hold. Then 



(19) 



7re 2 2R 

— <liminf liminf ——D* r (R) 

2 2R ire 
< limsup limsup —D* enc (#)<—, 



and the upper bound is achieved by a uniform scalar quantizer with 



lim lim J 2 K A„ so = L 

followed by Huffmann encoding. 

Proof: Given a quantization function, Huffmann encoding gives an optimal prefix code that minimizes L [15, Chapter 
5]. Let pi = Pr (Y : q (Y) = Wj) and let £i be the length of encoded codeword f enc (ui). Let H := J^fti l°g2Pi- Then 
H < L — J2iPi^i — H + 1. In addition, it is well known that the distortion of scalar quantization of a Gaussian source is 
lower bounded by -^2 2(h ^ H '> (1 + o H (1)), where h denotes the differential entropy of the source, and the lower bound is 
achieved by a uniform quantizer. Calculating h and interpreting H as a function of L establish the claimed result. ■ 
As expected, for a given average description length, the average distortion of uniform scalar quantization and Huffmann 
encoding is smaller than that of an optimal scalar quantizer with fixed length encoding. 



C. Distortion of Vector Quantization 

For the purpose of analyzing vector quantization schemes, we make the following assumptions. 
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Assumptions III: 

1) Let $ e R mxJV be a matrix satisfying the RIP with parameter S K € (0, 1). 

2) Assume that X G R" is exactly A-sparse, and that the nonzero entries of X are i.i.d. standard Gaussian random variables. 
Theorem 4: Suppose that Assumptions III hold. Then 

nlRm/K 

(1 - 5 K ) (1 + o K (1)) < lim inf ^D VQ (R) (20) 

2 2R 

< limsup D* VQ (R) < (1 + (1 + o m (1)) , (21) 

where o K (1) o and o m (1) 0. Another upper bound on DyQ (i?) is given by 

2 2R ir\/3 
limsup— D* VQ (R) < -1-H2, (22) 

where ^ 2 is as defined in (16). 

Remark 2: The comparison of the two upper bounds in (21) and (22) depends on the ratio between m and K. Consider the 
case where N — (3K, m — 9 (A' log (N/K)) = aK for some a, (3 > 1. The first upper bound becomes 

2 2R 

lim sup— -DyQ (R) < a (1 + fo) (1 + o m (1)) . 

7?.^oo li- 
lt is smaller than the second upper bound if and only if 

7T\/3 

<>JC < ^ — M2 - 1- 
2a 

The upper bound (22) is obtained by using the Cartesian product of scalar quantizers and invoking the result in (17). The 
bounds (20) and (21) are proved in Appendix B. The basic ideas behind the proof are similar to those used for proving Theorem 
2: the lower bound is obtained by averaging the distortions of optimal quantizers for every T £ ( ). while the upper bound 
is a uniform upper bound on the distortions of quantizers constructed for all T € ( ^?). 

Note that the lower bound in (20) is not achievable when K < m. The upper bounds (21) and (22) do not guarantee significant 
distortion reduction of vector quantization compared with scalar quantization. Due to their inherently high computational 
complexity, vector quantizers do not offer clear advantages that justify their use in practice. 



D. CS Measurement Matrix Quantization Effects 

In CS theory, the measurement matrix is generated either randomly or by some deterministic construction. Examples include 
Gaussian random matrices and the deterministic construction based on Vandermonde matrices [16], [17]. In both examples, 
the matrix entries typically have infinite precision, which is not the case in practice. It is therefore also plausible to study the 
effect of quantization of CS measurement matrix. 

Consider Assumption I where the measurement matrix is randomly generated. Let us assume that every entry <pi,j, 1 < i < m 
and 1 < j < N, is quantized using a finite number of bits. Note that iptj = q (tpij) is a bounded random variable and therefore 
Subgaussian distributed. The results in Theorem 1 are therefore automatically valid for quantized matrices as well. 

Suppose that the measurement matrix is constructed deterministically and then quantized using a finite number of bits. The 
parameters fj,i, [i 2 and 5 K of the quantized measurement matrix can be computed according to (15), (16) and (2), respectively. 
The results regarding scalar quantization and vector quantization described in Theorems 2 and 4 can be easily seen to hold in 
this case as well. 



E. Reconstruction Distortion 

Based on the results of the previous section, we are ready to quantify the reconstruction distortion of BP and SP methods 
introduced by quantization error. 
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It is well known from CS literature that the reconstruction distortion is dependent on the distortion in the measurements. 
Consider the quantized CS given by 

Y = q (Y) = <J?X + E, 

and where E g R m denotes the quantization error. Let X be the reconstructed signal based on the quantized measurements 
Y. Then the reconstruction distortion can be upper bounded by 



X X 



<c 2 ||E| 



2 ' 



(23) 



where the constant c differs for different reconstruction algorithms. The best bounding constant for the BP method was given 
in [7], and equals 

4 



V3 - 354K 

while for the SP algorithm, the constant was estimated in [5] 



VTTI 



4K 



1 + 4 



3 if 



J 3K 



SP $3K (1 — <>3if ) 

A lower bound on the reconstruction distortion is given as follows. Suppose that the support set T of the sparse signal x is 
perfectly reconstructed. The reconstructed signal X is given by 

X = (*£* T ) _1 * T Y, 

and the reconstruction distortion is lower bounded by 



X X 



> 



1 + 6 



For short, let 



K 



Clb 



Y-Y 



2 (1 + 6k) 



Sk iitpii 2 

2 11^112 ' 



(24) 



1 + S K ' 

Combining the bounds (23,24) and the results in Theorems 1-4, we summarize the asymptotic bounds on the reconstruction 
distortion as follows. Under Assumptions I, the reconstruction distortion of scalar quantization is bounded by 

c Ih — - — < hm lim — —hj 

2 R-*oo(K,m,N)^oo K 







■2 




X X 








to 



c 



< 



2 irV3 
sp 2 



for subspace algorithm 



c fep 2L 2^ ^ or basis pursuit algorithm 







2 




X X 








2 



and the reconstruction distortion of uniform scalar quantization is bounded by 

2 41og2 2 2 « 
Cih — ^ — < ™ hm ttt:L 

3 R^oo(K,m,N)^ooKR 

c 2 sp 4 1 g S 2 for subspace algorithm 
c? 41 ° g2 for basis pursuit algorithm 

Suppose that Assumption II holds. The reconstruction distortions for scalar quantization and uniform scalar quantization are 



< 
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respectively bounded by 



2 ^ ^ ,. • f 2 ; w 
cf b — — /Xi < hmmi— TT-rL 



22_R 
. A 

2 2R 







2" 




X-X 








2 



< lim sup E 



X X 



< 



c sp IL 2^^2 for subspace algorithm 



„2 7iV3 
^6p 2 



it 2 for basis pursuit algorithm 



and 



2 4 log 2 2 2K 
c lb —m < liminf— E 



X X 



Given the encoding rate i? per measurement, the reconstruction distortion of the optimal scalar quantizer is bounded as 



c ib — < liminf liminf 



6 



2 2R 

2 2R 



< lim sup lim sup 

R^oo (K,m, JV)— >oo 



K 



X X 



X X 



< 



2 Tre 

"sp 3 

C 6p 3 



for subspace algorithm 
for basis pursuit algorithm 



The bounds for reconstruction distortion associated with vector quantization are given by 

4 (1-Sk) (1 + o*(1)) 

X X 

2" 



22Rm/K 

< liminf E 

R^oo K 



2 2R 

< lim sup E 



X X 



< 



c 2 . (1 + 6k) (1 + o m (1)) for subspace algorithm 



c 2 p (1 + 5k ) (1 + o TO (1)) for basis pursuit algorithm 



and 



2 2R 

lim sup— — E 

R^oo K 



X X 



< 



c sp ZL 2^A*2 for subspace algorithm 

„2 Try^, 



L. bp — 2^M2 f° r basis pursuit algorithm 

It is worth noting that the upper bound (23) on reconstruction distortion may not be tight. Empirical experiments show that 
this upper bound often significantly over-estimates the reconstruction distortion [5], [7]. 



IV. Reconstruction Algorithms for Quantized CS 
We present next modifications of BP and SP algorithms that take into account quantization effects. 

To describe these algorithms, we find the following notation useful. Let Y be the quantized measurement vector. Given a 
vector Y, the corresponding quantization region can be easily identified: the quantization region of vector quantization 7£. Y 
is defined in (7); that of scalar quantization is given by the Cartesian product of the quantization regions for each coordinate, 
i.e., 7?-y — YYIL\ H-y where IZy is the quantization region of 

Similar to the standard BP method, the reconstruction problem can be now casted as 



minHxIlj subject to $x e 7^y- 



(25) 
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It can be verified that IZ^ is a closed convex set and therefore (25) is a convex optimization problem and can be efficiently 
solved by linear programming techniques. 

In order to adapt the SP algorithm to the quantization scenario at hand, we describe first a geometric interpretation of 
the projection operation in the SP algorithm. Given y S R m and <&t <= R" IX ' T ', suppose that &t has full column rank, in 
other words, suppose that the columns of <f>T are linearly independent. The projection operation in (3) is equivalent to the 
optimization problem 

min v-$txL (26) 

Let x* be the solution of the quadratic optimization problem (26). Then functions (3-5) are equivalent to proj (y, <& T ) = «&t x *, 
resid (y, $t) = y — 3>tx* and pcoeff (y, <fr T ) = x*. 

The modified SP algorithm is based on the above geometric interpretation. More precisely, we use the following definition. 

Definition 2: For given * T e M mx l T l, Y and TZ^, define 

Q:= {(x,y) ER m x TZ^ : 

||y-*rx|| 2 < ||y'-* r x'|| 2 V (x', y') e M |T| x 7^} , (27) 

and 

(x, y) = argmin y — Y . (28) 

(x,y)£Q 2 

It can be verified that the pair (x, y) is well defined. See Appendix C for details. 

This definition is introduced to identify the best approximation for Y among multiple points in TZy- that minimize ||y — $ T x|| 2 . 
Based on this definition, we replace the resid and pcoeff functions in Algorithm 1 with new functions 

resid (9) (Y, * t ) := y - *tx 

and 

pcoeff {q) (Y, * t ) := x, 

where the superscript (q) emphasizes that these definitions are for the quantized case. This gives the modified SP algorithm. 
The advantage of the modified algorithms are verified by the simulation results presented in the next section. 



V. Empirical Results 

We performed extensive computer simulations in order to compare the performance of different quantizers and different 
reconstruction algorithms empirically. The parameters used in our simulations are m = 128, N = 256 and K = 6. Given these 
parameters, we generated realizations of m x N sampling matrices from the i.i.d. standard Gaussian ensemble and normalize 
the columns to have unit ^-norm. We also selected a support set T of size \T\ = K uniformly at random, generated the entries 
supported by T from the standard i.i.d. Gaussian distribution and set all other entries to zero. We let the quantization rates vary 
from two to six bits. For each quantization rate, we used Lloyd's algorithm (Section II-B) to obtain a nonuniform quantizer and 
employed brute-force search to find the optimal uniform quantizer. To test different quantizers and reconstruction algorithms, 
we randomly generated * and x independently a thousand times. For each realization, we calculated the measurements Y, 
the quantized measurements Y and the reconstructed signal X. 

Fig. 1 compares uniform and uniform quantizers with respect to measurement distortion. Though the quantization rates in 
our experiments are relatively small, the simulation results are consistent with the asymptotic results in Theorem 1 : nonuniform 
quantization is better than uniform quantization and the gain increases with the quantization rate. Fig. 2a compares the 
reconstruction distortion of the standard BP and SP algorithms. The comparison of the modified algorithms is given in Fig. 2. The 
modified algorithms reduce the reconstruction distortion significantly. When the quantization rate is six bits, the reconstruction 
distortion of the modified algorithms is roughly one tenth of that of the standard algorithms. Furthermore, for both the standard 
and modified algorithms, the reconstruction distortion given by SP algorithms is much smaller than that of BP methods. 
Note that the computational complexity of the SP algorithms is also smaller than that of the BP methods, which shows 
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clear advantages for using SP algorithms in conjunction with quantized CS data. An interesting phenomenon occurs for the 
case of the modified BP method: although nonuniform quantization gives smaller measurement distortion, the corresponding 
reconstruction distortion is actually slightly larger than that of uniform quantization. We do not have solid analytical arguments 
to completely explain this somewhat counter-intuitive fact. 



Appendix 



A. Proof of Theorem 1 

Let T = {1 < j < N : X, ^ 0} be the support set of x, i.e., x t ^ for all i e T and Xj = for all j g T. It is easy to 
show that for all 1 < i < m and T c {1, • • • , N} such that |T| = K, 



E 



= 



and 



E 



= K. 



According to the Central Limit Theorem, the distribution of J2j eT converges weakly to the standard Gaussian 

distribution as K — > oo. This can be verified by the facts that AijXjS are independent and identically distributed, and that 
the moment generating function of AijXj is well defined. As a result, the distribution of y^Yi converges weakly to the 
standard Gaussian distribution as K,m, N —> oo. 

We apply a scalar quantizer with 2 R levels to the random variable y^Yi. In this case, one has 

l|2" 



Y — Y 



1 m 



E 



mK 

-. VTi 

-Ve 

•11 ^— ' 



.i=l 



m 



i=l 



Yi 



Y, 



E 



Y 



Y, 



(29) 



where the last line represents the distortion of quantizing yj^Yi. Note that the distortion-rate function for scalar quantization 
of a Gaussian random variable is given by 



lim 2 2H D* (R) 



(30) 



where a 2 is the variance of the underlying Gaussian source (see [18] for a detailed proof of this result). We then have 



2 2R TTx/^ 

lim lim D* (R) = lim 2 2R D* (R) = —, 



which completes the proof of (13). 

Consider a uniform quantizer with codebook C u , such that \C U \ — 2 R , and apply the corresponding uniform quantizer to 
the random variable y/jfYi. It was shown in [19] that the distortion-rate function of uniform scalar quantizers of a Gaussian 
random variable equals 

i . 

(31) 



2 2R 4 



It is clear that 



2 2R 2 2R 4 

R-^ca (K,m, N)^oo K R R^ca H O 



This proves Theorem 1. 
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B. Proof of Theorems 2 and 4 

For completeness, let us first briefly review the key results used for deriving the asymptotic distortion-rate function for CS 
vector quantization. Suppose the source Y€l* has probability density function / (y). Let 1Z C K fc be a quantization region 
and weCbe the corresponding quantization level. The corresponding normalized moment of inertia (NMI) is defined as 

(kdy) 1+2/k 

The optimal NMI equals 

ml = inf m (1Z) , 

k newt" 

only depends on the number of dimensions: ml = Ck with Ck — yj when k = 1 and Ck — * when k — > oo. Thus the 
distortion rate function satisfies 

2 R f f (v) 

lim — D (J?) = / -i^J- m * k dy, (32) 
fl^oo k J A 2/fe (y) 

where R is the quantization rate per dimension, and (y) denotes the point density function. In this case, the integral 

Afe (y) dy 



IM 

gives the fraction of quantization levels belonging to M. for all measurable sets M C For simplicity, we have assumed 
that A/c (y) is continuous on M fe . For fixed m* k , the problem of designing an asymptotically optimal quantizer can be reduced to 
the problem of finding the point density function X* k (y) that minimizes (32). By Holder's inequality, the optimal point density 
function is given by 

VM _ / fc/(fc+2) (y) 
k[y) j / fe /( fe + 2 >(y) • dy' 

and the asymptotic distortion rate function is therefore 

K^jjD* (R) = c k (/ / fe /( fe + 2 ) (y) • dy) * . (33) 

If the source Y is Gaussian distributed with covariance matrix S > 0, then the asymptotic distortion rate function (33) can 
be explicitly evaluated as 

lim —£>*(£) = Cfc|27rE|M—^) (34) 



i( ■ x k \ k 

= |E|*(1 + *(1)), 

fc + 2 

where ok (1) — > as K — > oo, and the last equality follows from the fact that — > an d (^^) 2 — > e as /c ^ oo. 
We present next the key results used for proving the upper bounds in (17) and (21). 

Proposition 1: Let Y € R k be a Gaussian random vector with zero mean and covariance matrix S . Let {q_R (•)}, where 
the subscript R denotes the quantization rate, be a sequence of quantizers designed to achieve the asymptotic distortion rate 
function for Gaussian source TV (0, Si) with < Si e R kxk . Apply q R (•) to Y . If S < Si, then 



2 2R 
lim — — Ey 

fl^oo k 



|Y - qi? (Y )|| 2 



< Cfe (27rSi)W^J . (35) 
Proof: First assume that < S . Let / (y) and /i (y) be the probability density functions for Y and Y x , respectively. 
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Denote Eye ||Y - q R (Y )\\l by D (R). It is clear that 



2 R 

lim —D (R) 

R^oo k 



Ck 



Ck 



fo (y) 



( a m (y)) 
/o(y) 

2/(fe+2) 



2/fc 



dy 



/r 



(y) 



fe/(fe+2) 



(y) dy 



We upper bound the first integral as follows 

fo (y) 



fl ,{k+2) {y) 

_ |2ttSi|^ 
" |2tt£o| 1/2 

(a) |2ttSi|^ 



exp 



--y 



So 1 



fc + 2 



Sr 1 W dy 



|2ttS 



1/2 



|27rS 



1/2 



(b) 

< |27rSi| fc + 2 



/ 



fc 

k + 2 



1/2 



where (a) holds because 



= / /r 2 {x)dx, 



So" 1 



k + 2 



sr 1 



— S„ 1 I Ifc — 



k + 2 



k + 2 



and (b) follows from the assumption S < Si. Substituting (37) into (36), one obtains 

2* 

lim —D (R) 



I* 



Cfc|27rSi|* 



fe/(fe+2) 

k + 2 



(y)dy 

k+2 



k 



(36) 



(37) 



which will be used to prove the upper bounds in (17) and (21). 

Suppose that |So| = (some of the eigenvalues of So are zero). Since So < Si, when e > is sufficiently small, we have 
< S £ := So + el < Si. Let f e (y) be the probability density function of Gaussian vector with zero mean and variance S e . 
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Then, 



2 R 

lim —D (R) 



/o(y) 



\2/fe 

K,i (y)) 

lim/ e (y) 



dy 



(A* feil (y))' 



(c) /* /e (y) 

< Cfehmmf / -— ^ dy 

fc + 2 

< Cfc^TrSil* f — 

where (c) follows from Fatou's lemma [20], and (d) follows from the first part of this proof. This proves the proposition. ■ 
1 ) Lower Bounds for Scalar Quantization: 

We prove the lower bound in (17). Given Assumptions II, each Yi, 1 < i < m, is a linear combination of Gaussian random 
variables, and therefore each Yi is a Gaussian random variable itself. For a given i and a given T, the mean and the variance 
of Yi are E [Yi] = and of T = E [Y t 2 ] = J2jeT Pij' respectively. The variance depends on the row index i and the support 
set T. We calculate the average variance across all rows and all support sets as 




mN ^ 

= —Mi, (38) 

m 

where 

(a) is obtained by exchanging the sums over T and j, 

(b) holds because for any given 1 < j < N, there are (%-{) 

many subsets T containing the index j, 

(c) is due to the fact that (%Z\)/(%) = K / N > 

(d) follows from the definition (15). 

Suppose that one deals with the ideal case: the support set T is known before taking the measurements; and for different 
values of i and T, we are allowed to use different quantizers. Given i and T, we apply the optimal quantizer for the Gaussian 
random variable y^Y i7 so that the quantization distortion of Yi satisfies 



lim 2 2R D* t (R) = 7T ^^ t) V3, 



R— »oo 2 
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which is a direct application of (33) with k = 1. Taking the average over all i and all T gives 

lim if> T [2 2fl D? T (i2)] 

i=l 
1 T " 1 



1 = 1 (/f) 
1 1 m 

vx; i=i t 

where the last equality follows from (38). 

However, the support set T is unknown before taking the measurements. Furthermore, the same quantizer has to be employed 

r ~ 2"i 

for different choices of i and T. Thus, for every R, i and T, Ey; 



K 



Yi-Yi 



> D* T (R). As a result, 







2" 




Ey 


Y- Y 










2 





2 2R 

lim inf Ey 

R^oo K 



lim inf — 



to 

T i=l 



{Vi - ViY 



cylH. 1 

> lim inf Y-YD* T (R) 

\TI T i=l 

Since the above derivation is valid for all if, to and N, the claim in (17) holds. 

The result in (18) for uniform quantizers can be proved using similar arguments. For the ideal case, given i and T, apply 
the optimal uniform quantizer for the standard Gaussian random variable to y^Vi- The corresponding distortion rate function 
for this case was characterized in [19] and s given by 



lim 2 2R D* U ^ T (R) = -al T ln2. 



R—>oo 



Therefore, 



2 2R r r 

lim inf — E T Ey Y-Y 

R-^oo K [ 

> ^iln2, 

which completes the proof of (18). 

2) The Upper Bound for Scalar Quantization: 

By the definition of \ii in (16), the variance of the Gaussian random variable \Pj£Y% is upper bounded by /X2 uniformly 
for all i and all T. For each quantization rate R, we design the optimal quantizer for a Gaussian source with variance [ii and 
apply this quantizer to quantize all components of Y. Using (35), one can show that the quantization distortion for all i and 
T satisfies 

2 2R 

lim sup— — EtEy 

< |m 2 V3, 

which proves the upper bound in (17). 

3) The Lower Bound for Vector Quantization: 

The basic idea for proving the lower bound in (20) is similar to that behind (17). For each T, a lower bound on the 
minimum achievable distortion is derived. The average distortion taken over all the sets T serves as a lower bound of the 
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overall distortion-rate function. 

Suppose the ideal case where we have prior knowledge of T e ('^)- We study the distortion rate function for every given 
T. The measurement vector Y is Gaussian distributed with zero mean and covariance matrix <&t3?T' where 3*t consists of 
the columns of <fr indexed by T. The singular value decomposition of $>t&t gi yes UtAtU t , where Uy e K mxm has 
orthonormal columns and A T = diag (Ai, A 2 , • • • , A m ) is the diagonal matrix formed by the singular values Ai > A 2 > • • • > 
A TO . Note that Ai (<I? t <J?t) = \ (*t3?t) for 1 < i < If. According to Assumption III. 1 , the measurement matrix <1? satisfies 
the RIP with constant parameter 5 K , which implies that I — Sk < \ (<J> T *r) < 1 + 5k for all 1 < i < K. It can be 



concluded that 1 
where Ut k € I 



5k < Ai < 1 + 5k for 1 < i < K and Ai = for K + 1 < i < m. As a result, <&t& t = Ut.xAt.kU 



t.k 



>mxK 



contains the first K columns of Ut and At k € 



vKxK 



is the diagonal matrix formed by the K 



largest singular values. Denote the matrix formed by the last m — K columns of U by K : clearly, Ut = [Ur^Uy K ] . 

The best quantization strategy is to quantize Y = U T K Y so that no quantization bit is used for the "trivial signal" 
(Uy K )*Y. It is clear that Y <~ Af (0, At.k) and < At.k- The corresponding asymptotic distortion rate function is 
therefore 



2 2mR/K , 

lim ~^^D T (R) [ J c K (2kA t , k ) 



K + 2 
K 



> (1 - 6k) (1 + ok (1)) , 

where the 2 2mR / K term comes from the fact that the total quantization rate mR is used to quantize a if-dimensional signal. 
Since this lower bound is valid for all T <E ( ), we have proved the lower bound in (20). 
4) The Upper Bound for Vector Quantization: 

Let e > be a small constant. Let (•)} be a sequence of quantizers that approaches the asymptotic distortion rate 
function for quantizing Y ~ Af (0, (1 + 5k + e) I m )- To prove the upper bound in (21), apply the quantizer sequence {q R (•)} 
to Y. For every T e ( [ ^), Y ~ Af (0, * T * T )- According to the Assumption III. 1, * T * T < C 1 + s k + e) I m - Applying 
Proposition 1, we have 



2 2R 

lim Ey 

r^co m 



|Y-q fl (Y)||^ 
< (1 + 5 K + e)(l + o M (l)). 
The upper bound in (21) is proved by taking the limit e J. 0. 



C. The Existence and Uniqueness o/(x, y) in Equation (28) 
Consider the optimization problem 



min lly — <&tx|| 9 , 



which is equivalent to 







X 




min 


h*T I] 






x,y)GKl T lxK^ 




_ y _ 





(39) 



(40) 



Note that the objective function is convex and the constraint set is convex and closed. The optimization problem (40) has at 
least one solution. Note that the matrix [— <J?t I] does not have full row-rank. Hence, the solution may not be unique: the set 
Q defined in (27) gives all the possible solutions, and is convex and closed. 

Let «p be the projection function from IRl T l x R m to K m , i.e., ((x, y)) = y. Since the set Q is convex, the set <P (Q) is 
also convex. The quadratic optimization problem 



mm 

yeq3(Q) 



Y-y 

has a unique solution. Denote this unique solution by y. Furthermore, recall our assumption that <I>t has full column rank. 
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For any given y e K m , the solution of 



min lly — <&rx|L 



is therefore unique. As a result, there exists a unique x e Rl T l such that (x, y) e Q. This establishes the existence and 
uniqueness of the point (x, y). 
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Fig. 2: Distortion in the reconstruction signals. 



